On the Overhead of Interference Alignment:
Training, Feedback, and Cooperation
Abstract

Interference alignment (IA) is a cooperative transmission strategy that, under some conditions, 
achieves the interference channel's maximum number of degrees of freedom. Realizing IA gains, 
however, is contingent upon providing transmitters with sufficiently accurate channel knowledge. In 
this paper, we study the performance of IA in multiple-input multiple-output systems where channel 
knowledge is acquired through training and analog feedback. We design the training and feedback system 
to maximize IA's effective sum-rate: a non-asymptotic performance metric that accounts for estimation 
error, training and feedback overhead, and channel selectivity. We characterize effective sum-rate with 
overhead in relation to various parameters such as signal-to-noise ratio, Doppler spread, and feedback 
channel quality. A main insight from our analysis is that, by properly designing the CSI acquisition 
process, IA can provide good sum-rate performance in a very wide range of fading scenarios. Another 
observation from our work is that such overhead-aware analysis can help solve a number of practical 
network design problems. To demonstrate the concept of overhead-aware network design, we consider 
the example problem of finding the optimal number of cooperative IA users based on signal power and 
mobility. 

I. Introduction 

Interference alignment (IA) for the multiple-input multiple-output (MIMO) interference channel
is a cooperative transmission strategy that attempts to structure interfering signals such that
they occupy a reduced dimensional space when observed at the receivers [1], [2]. Alignment
often enables achieving the maximum number of degrees of freedom (DoF) [1], [2]. Precoding
transmitted signals to carefully align them at the receivers, however, requires knowledge of the
interfering channels in the system, collectively known as channel state information (CSI). Perfect
CSI is assumed to be available when designing most IA algorithms [1], [3]-[5] or reporting
genie-aided IA gains. Practical systems, however, acquire receiver CSI with the help of training
sequences or pilots [6]. Such CSI can then be shared with the transmitters via feedback. As
a result, practical CSI is imperfect and comes with an overhead signaling cost, both of which 
penalize the effective data rates achieved. Realizing the gains of IA is therefore contingent upon 
providing systems with sufficiently accurate CSI at a manageable overhead cost. 

Several approaches have been proposed to fulfill IA's transmit CSI requirement [7]-[9],
typically assuming perfect CSI at the receiver. The feedback strategy in [7] uses
Grassmannian codebooks to compress and improve CSI feedback in single-antenna frequency-extended
IA systems. This strategy was then extended to multiantenna frequency-extended
systems in [8]. Both [7] and [8] guarantee that limited feedback preserves the number of DoF by
scaling the number of feedback bits with SNR, which makes codebooks prohibitively large [10].
To overcome the problem of scaling codebook size, and relax the reliance on frequency selectivity
for quantization, [9] proposed an analog feedback strategy for constant MIMO interference
channels. Using analog feedback, a constant data rate gap from perfect CSI performance was
shown, as long as the SNRs on the forward and feedback links are order-wise equal. A limitation
of the analysis in [7]-[9], however, is that the number of DoF remains the primary performance
metric considered. IA's sum-rate performance at finite SNR, especially when accounting for the
time spent on overhead signaling, has yet to be considered.

Attempts to more directly analyze or reduce overhead are limited to [11]-[14]. To analyze
the effect of overhead, [11] considers the effective number of spatial DoF of an IA system with
training and feedback. By considering DoF, however, [11] implicitly characterizes performance
at infinitely high SNR. Alternatively, [12] reduces codebook size to limit overhead in limited
feedback IA systems by leveraging temporal correlation without providing any overhead-aware
analysis. In another line of work, information about the network topology is used to partition
users into optimally sized alignment groups [13]. In [14], IA is applied to partially connected
interference channels. User grouping and partial connectivity, however, only reduce the number
of channels that must be shared without suggesting an efficient training and feedback strategy.

In this paper, we characterize the performance of a MIMO IA system that is designed
for perfect CSI operation yet only has access to imperfect CSI through training and analog
feedback [9], [15], [16]. Thus, the performance demonstrated in this paper constitutes a lower
bound for systems that are designed to be more robust to imperfect CSI through improved
precoding strategies such as [31], for example. We adopt a block-fading model wherein the
channel remains constant over the block length, and varies independently across blocks. In
contrast with earlier work on IA with feedback, we precisely model channel selectivity by
leveraging the relationship between block-fading and continuous-fading channels shown in [17].
This relationship allows us to define the concept of Doppler spread in a block fading channel
and explicitly relate the size of the coherence block to that Doppler spread. Since both CSI
acquisition and data transmission must now occur within the limits of a single coherence block,
the IA system is faced with a non-trivial tradeoff: too much overhead leaves little time for
payload data transmission, whereas too little overhead results in large sum-rate losses due to
poor CSI quality [17]-[21]. In this paper, we design the training and analog feedback system to
maximize IA's effective sum-rate, a non-asymptotic performance metric that accounts for both 
CSI quality and CSI acquisition overhead. CSI acquisition overhead is a fundamental concept 
that was largely neglected in earlier work on IA with imperfect CSI. 

We begin by giving a tractable expression for the IA sum-rate in genie-aided systems with 
perfect CSI, and extend the analysis under a general model for imperfect CSI. We then specialize 
our results to a system with training and analog feedback by characterizing CSI quality as a 
function of system parameters such as training overhead, feedback overhead and transmit power 
on both forward and reverse links. This results in a tractable expression for IA's effective sum- 
rate, which we proceed to optimize. To give a closed-form solution for the optimal effective 
sum-rate, we build on the method in [17] and optimize a series expansion of the objective
function. Initial results were reported in our previous work [22]. In this paper, we complete IA's
performance analysis by analytically characterizing its maximum achievable effective sum-rate 
and the corresponding optimum overhead budget. The main insights and conclusions that can 
be drawn from the effective sum-rate analysis can be summarized as follows: 

• Practical IA performance is not only a function of basic system parameters such as network 
size and SNR, but is tightly related to quantities such as Doppler spread, and feedback 
channel quality. Moreover, the dependence of both the maximum effective sum-rate, and 
the corresponding optimal overhead budget, on the various system parameters can be
characterized accurately.

• By properly designing the training and feedback stages, IA can be made both feasible and 
beneficial in a wide range of fading scenarios, even when its relatively high overhead is 
considered. 

• Overhead-aware analysis is essential to the design of IA networks. As an example of this 
observation, we use the overhead analysis to give simple results on the optimal number of 
cooperative IA users for channels with varying levels of selectivity. 

Throughout this paper, we use the following notation: $\mathbf{A}$ is a matrix; $\mathbf{a}$ is a vector; $a$ is a
scalar; $(\cdot)^*$ denotes the conjugate transpose; $\|\mathbf{a}\|$ denotes the 2-norm of $\mathbf{a}$; $|a|$ is the absolute
value of $a$; $\mathbf{I}_N$ is the $N \times N$ identity matrix; $\mathcal{CN}(\mathbf{a}, \mathbf{A})$ is a complex Gaussian random vector
with mean $\mathbf{a}$ and covariance matrix $\mathbf{A}$; $(a_1, \ldots, a_k)$ is an ordered set; $\mathbb{E}[\cdot]$ denotes expectation.



II. System Model

Consider the $K$-user narrowband MIMO interference channel shown in Fig. 1, in which
transmitter $i$ communicates with its paired receiver $i$ and interferes with all other receivers
$\ell \neq i$. For simplicity of exposition, consider a homogeneous network where all transmitters are
equipped with $N_T$ antennas and all receivers with $N_R$ antennas, and each node pair communicates
via $d \le \min(N_T, N_R)$ independent spatial streams. The results can be generalized to a different
number of streams or antennas at each node, provided that IA remains feasible [23].

Assuming perfect time and frequency synchronization, the sampled baseband signal at receiver
$i$ can be written as

$$\mathbf{y}_i = \sqrt{\frac{P}{d}}\,\mathbf{H}_{i,i}\mathbf{F}_i\mathbf{s}_i + \sum_{\ell \neq i} \sqrt{\frac{P}{d}}\,\mathbf{H}_{i,\ell}\mathbf{F}_\ell\mathbf{s}_\ell + \mathbf{v}_i, \qquad (1)$$

where $\mathbf{y}_i$ is the $N_R \times 1$ received signal vector, $P$ is the transmit power, $\mathbf{H}_{i,\ell}$ is the $N_R \times N_T$
discrete-time effective baseband channel matrix from transmitter $\ell$ to receiver $i$,
$\mathbf{F}_i = [\mathbf{f}_i^1, \ldots, \mathbf{f}_i^d]$ is transmitter $i$'s $N_T \times d$ precoding matrix, $\mathbf{s}_i$ is the $d \times 1$ transmitted symbol
vector at node $i$ such that $\mathbb{E}[\mathbf{s}_i\mathbf{s}_i^*] = \mathbf{I}_d$, and $\mathbf{v}_i$ is a vector of i.i.d. complex Gaussian noise
samples with covariance matrix $\sigma^2\mathbf{I}_{N_R}$. The channels are assumed to be independent across
users, each with i.i.d. $\mathcal{CN}(0, 1)$ entries. Large-scale fading can be included in the system
model at the expense of a more involved exposition in Section IV.

The received signal at transmitter $i$ on the feedback channel is

$$\overleftarrow{\mathbf{y}}_i = \sqrt{\frac{P_F}{N_R}} \sum_{\ell=1}^{K} \mathbf{G}_{i,\ell}\mathbf{x}_\ell + \overleftarrow{\mathbf{v}}_i, \qquad (2)$$

where $P_F$ is the feedback power available such that $P_F/P = \gamma$, $\mathbf{G}_{i,\ell}$ is the $N_T \times N_R$ discrete-time
feedback channel between receiver $\ell$ and transmitter $i$ with i.i.d. $\mathcal{CN}(0, 1)$ entries, $\mathbf{x}_\ell$ is
the symbol vector with unit variance entries sent by receiver $\ell$, and $\overleftarrow{\mathbf{v}}_i$ is a complex vector of
i.i.d. circularly symmetric white Gaussian noise with covariance matrix $\sigma^2\mathbf{I}_{N_T}$. The forward and
feedback channels are assumed to be independent in the error analysis of Section IV, i.e., a
frequency division duplexed system or a general non-reciprocal system is assumed.

We adopt a block-fading channel model in which channels remain fixed for a period $T_{\mathrm{frame}}$
but vary independently from block to block. To model the effect of channel selectivity on IA
performance, we set the block length to $T_{\mathrm{frame}} = \frac{1}{2f_D}$, where $f_D$ plays the role of the block
fading channel's effective Doppler spread. The definition of $f_D$ is motivated by the results in
[17] showing a relationship between continuous fading and block fading systems. To enable
IA over such a channel, both CSI acquisition and payload data transmission must occur within
the coherence time $T_{\mathrm{frame}}$, or else the CSI acquired becomes obsolete. The IA system then
encounters a well-known tension between CSI acquisition and data transmission [17]-[21], and
must allocate resources to each of the processes to optimize overall performance.

To account for CSI acquisition overhead, and to accurately characterize the effective data rate
achieved by IA, we adopt the overhead model shown in Fig. 2. In this model, overhead signaling
consumes time resources that could otherwise be used for data transmission, i.e., CSI acquisition
penalizes effective sum-rate. For such an overhead model, the effective sum-rate (in bits/s/Hz)
can be written as [17]-[19]

$$R_{\mathrm{eff}}(P, T_{\mathrm{ohd}}) = \left(\frac{T_{\mathrm{frame}} - T_{\mathrm{ohd}}}{T_{\mathrm{frame}}}\right) \bar{R}_{\mathrm{sum}}(P, T_{\mathrm{ohd}}), \qquad (3)$$

where $T_{\mathrm{ohd}}$ is the total time spent on training and feeding back channels, and $\bar{R}_{\mathrm{sum}}(P, T_{\mathrm{ohd}})$
is the average sum-rate in bits/s/Hz achieved by IA on the channel uses allocated for payload
transmission. Using (3), and previous insights into IA performance, we highlight the tradeoff
between overhead signaling and data transmission. Increasing overhead improves CSI quality
and in turn improves $\bar{R}_{\mathrm{sum}}(P, T_{\mathrm{ohd}})$, but the relative period over which $\bar{R}_{\mathrm{sum}}(P, T_{\mathrm{ohd}})$ can
be achieved shrinks. A similar tension exists when lowering overhead; less overhead allows
more channel uses for data transmission but the sum-rate per channel use suffers due to poor
CSI quality. The objective then becomes maximizing the effective sum-rate given in (3) by
optimally trading off overhead with data transmission [17]-[21]. Throughout this paper, we treat
$\bar{R}_{\mathrm{sum}}(P, T_{\mathrm{ohd}})$ as an information-theoretic quantity, and thus derive mutual information-based
sum-rates achievable without errors. IA performance can also be analyzed from the perspective
of fixed-rate transmission where metrics such as bit error rate may be of interest [24].

III. Interference Alignment: An Average Sum-Rate Analysis 

This section derives the average sum-rate achieved by IA in both genie-aided networks where 
channels are known perfectly, as well as practical systems where CSI is imperfect. 

A. Interference Alignment with Perfect CSI 

IA often achieves the full number of DoF supported by MIMO interference channels. In cases
where the full DoF cannot be guaranteed, IA has been shown to provide significant gains in high-
SNR sum-rate [?], [9], [25]. While this paper focuses on IA, even better performance could be
achieved with other precoding algorithms that seek a balance between interference minimization
and signal power maximization [?], [26], [27]. The algorithms in [?], [26], [27], however, do
not readily lend themselves to average sum-rate analysis.

To analyze IA sum-rates, we begin by examining the effective channels created after precoding 
and combining. For tractability, we focus on IA with a simple per-stream zero-forcing (ZF) 
receiver. Recall that in the high (but finite) SNR regime, where IA is most useful, gains from 
more involved receiver designs are limited. In such a system, receiver i projects its signal onto 
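the columns of the zero-forcing combiner $\mathbf{W}_i = [\mathbf{w}_i^1, \ldots, \mathbf{w}_i^m, \ldots, \mathbf{w}_i^d]$, which gives

$$(\mathbf{w}_i^m)^*\mathbf{y}_i = \sqrt{\frac{P}{d}}\,(\mathbf{w}_i^m)^*\mathbf{H}_{i,i}\mathbf{f}_i^m s_i^m + \sqrt{\frac{P}{d}} \sum_{(k,\ell) \neq (i,m)} (\mathbf{w}_i^m)^*\mathbf{H}_{i,k}\mathbf{f}_k^\ell s_k^\ell + (\mathbf{w}_i^m)^*\mathbf{v}_i. \qquad (4)$$

At the output of these linear receivers $\mathbf{w}_i^m$, the conditions for perfect IA can be stated as

$$(\mathbf{w}_i^m)^*\mathbf{H}_{i,k}\mathbf{f}_k^\ell = 0, \quad \forall (k,\ell) \neq (i,m), \qquad (5)$$

$$\left|(\mathbf{w}_i^m)^*\mathbf{H}_{i,i}\mathbf{f}_i^m\right| \ge c > 0, \quad \forall i, m, \qquad (6)$$

where alignment is guaranteed by (5), and (6) is satisfied almost surely [?], [?].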

As a result of conditions (5) and (6), the combination of IA and ZF effectively creates $Kd$ non-interfering
scalar channels. The maximum mutual information across these channels is achieved
via Gaussian signaling, which yields an instantaneous sum-rate given by

$$R_{\mathrm{sum}} = \sum_{i=1}^{K} \sum_{m=1}^{d} \log_2\left(1 + \frac{P}{d\sigma^2}\left|(\mathbf{w}_i^m)^*\mathbf{H}_{i,i}\mathbf{f}_i^m\right|^2\right). \qquad (7)$$

To derive an expression for the average sum-rate, i.e., $\bar{R}_{\mathrm{sum}} = \mathbb{E}[R_{\mathrm{sum}}]$, we first give the following
lemma.

Lemma 1 ([9, Appendix A]): The effective direct channels $(\mathbf{w}_i^m)^*\mathbf{H}_{i,i}\mathbf{f}_i^m$ are independent and
Gaussian distributed with unit variance if: (i) the precoders $\mathbf{F}_i$ are unitary and are generated by
an IA solution that does not consider the direct channels $\mathbf{H}_{i,i}$, and (ii) the combiners $\mathbf{W}_i$ are
calculated to simply zero-force inter-user and inter-stream interference.

The conditions Lemma 1 places on precoder and combiner calculation are satisfied by most IA
solutions such as [1], [3]-[5]. Hence, as a result of Lemma 1, the scalar point-to-point channels
created by the combination of IA and ZF experience Rayleigh fading. Consequently, the average
sum-rate can be written in exponential integral form as [28], [29]

$$\bar{R}_{\mathrm{sum}}(\rho) = Kd \log_2(e)\, e^{1/\rho} E_1\!\left(\frac{1}{\rho}\right), \qquad (8)$$

which is written as a function of the per-stream SNR, $\rho = \frac{P}{d\sigma^2}$, and $E_1(\eta) = \int_1^\infty t^{-1} e^{-\eta t}\,dt$ is an
exponential integral.
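As a numerical companion to (8), the following minimal sketch evaluates the perfect-CSI average sum-rate using only the Python standard library. The names `exp1_scaled` and `avg_sum_rate` are illustrative and do not come from the paper; the power series and continued fraction used for $E_1$ are textbook methods assumed here for convenience, and `rho` is the per-stream SNR in linear scale.

```python
import math

EULER_GAMMA = 0.5772156649015329


def exp1_scaled(x: float) -> float:
    """Return exp(x) * E1(x) for x > 0, where E1(x) = int_1^inf exp(-x*t)/t dt."""
    if x <= 0.0:
        raise ValueError("x must be positive")
    if x <= 1.0:
        # Power series: E1(x) = -gamma - ln(x) - sum_{k>=1} (-x)^k / (k * k!)
        e1 = -EULER_GAMMA - math.log(x)
        term = 1.0
        for k in range(1, 60):
            term *= -x / k  # term == (-x)^k / k!
            e1 -= term / k
        return math.exp(x) * e1
    # Continued fraction (modified Lentz iteration); h converges to exp(x) * E1(x),
    # which avoids overflowing exp(x) at very low SNR (large x).
    b = x + 1.0
    c = 1e300
    d = 1.0 / b
    h = d
    for i in range(1, 1000):
        a = -float(i * i)
        b += 2.0
        d = 1.0 / (a * d + b)
        c = b + a / c
        delta = c * d
        h *= delta
        if abs(delta - 1.0) < 1e-15:
            break
    return h


def avg_sum_rate(rho: float, num_users: int, num_streams: int) -> float:
    """Average sum-rate of (8) in bits/s/Hz: K d log2(e) exp(1/rho) E1(1/rho)."""
    return num_users * num_streams * math.log2(math.e) * exp1_scaled(1.0 / rho)
```

For instance, `avg_sum_rate(100.0, 3, 2)` would give the average sum-rate of a three-user, two-stream system at a per-stream SNR of 20 dB under the assumptions of Lemma 1.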

B. Interference Alignment with CSI from Training and Feedback 

When the channels are not known perfectly, interference cannot be aligned perfectly. Mis- 
alignment leads to "leakage interference", which reduces the signal-to-interference-plus-noise 
ratio (SINR) in the desired signal space. Moreover, imperfect knowledge of the direct channel 
implies that receivers will perform mismatched decoding [30], again reducing effective SINR.
In this section, we examine the effect of imperfect CSI on the performance of an IA system that
is optimized for perfect-CSI operation, i.e., a system that does not consider CSI imperfection in
its design. Thus, the performance results demonstrated in this paper can be improved upon by
adopting precoding algorithms that are more robust to CSI errors such as [31].

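Consider an IA system in which transmitters use a common set of channel estimates as input to
an IA solution such as [1], [3]-[5], i.e., they calculate imperfect IA precoders $\hat{\mathbf{F}}_i$ and combiners
$\hat{\mathbf{W}}_i$. Denote the channel estimates as $\hat{\mathbf{H}}_{i,k}$ and the corresponding error as $\tilde{\mathbf{H}}_{i,k} = \mathbf{H}_{i,k} - \hat{\mathbf{H}}_{i,k}$. In
this system, the IA solution satisfies

$$(\hat{\mathbf{w}}_i^m)^*\hat{\mathbf{H}}_{i,k}\hat{\mathbf{f}}_k^\ell = 0, \quad \forall (k,\ell) \neq (i,m), \qquad (9)$$

$$\left|(\hat{\mathbf{w}}_i^m)^*\hat{\mathbf{H}}_{i,i}\hat{\mathbf{f}}_i^m\right| \ge c > 0, \quad \forall i, m. \qquad (10)$$

We assume receivers obtain perfect knowledge of the combiners $\hat{\mathbf{W}}_i$ and the imperfect effective
direct channels $(\hat{\mathbf{w}}_i^m)^*\hat{\mathbf{H}}_{i,i}\hat{\mathbf{f}}_i^m$ for detection, an assumption similar to [7]-[9], [13], [32],$^1$ whose
relaxation is a topic of future work. In general, receiver side information about the effective
channels can be acquired blindly [33] or via additional training or silent phases [16]. For such
an IA system, the received signal after projection is

$$(\hat{\mathbf{w}}_i^m)^*\mathbf{y}_i = \sqrt{\frac{P}{d}}\,(\hat{\mathbf{w}}_i^m)^*\hat{\mathbf{H}}_{i,i}\hat{\mathbf{f}}_i^m s_i^m + \sqrt{\frac{P}{d}} \sum_{k,\ell} (\hat{\mathbf{w}}_i^m)^*\tilde{\mathbf{H}}_{i,k}\hat{\mathbf{f}}_k^\ell s_k^\ell + (\hat{\mathbf{w}}_i^m)^*\mathbf{v}_i, \qquad (11)$$

where we have used the fact that conditions (9) and (10) are satisfied, thus, for $(k,\ell) \neq (i,m)$,
$(\hat{\mathbf{w}}_i^m)^*\mathbf{H}_{i,k}\hat{\mathbf{f}}_k^\ell = (\hat{\mathbf{w}}_i^m)^*(\hat{\mathbf{H}}_{i,k} + \tilde{\mathbf{H}}_{i,k})\hat{\mathbf{f}}_k^\ell = (\hat{\mathbf{w}}_i^m)^*\tilde{\mathbf{H}}_{i,k}\hat{\mathbf{f}}_k^\ell$.

Analyzing the maximum sum-rates achievable on the channel in (11) is in general difficult, as
it requires optimizing the distribution of the input symbols $\mathbf{s}_i$ for the interference channel in (11).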
Recall, however, that our objective is to analyze a system optimized for perfect CSI operation, i.e. 
one that does not account for CSI imperfection. This enables making the following assumptions 
that would be expected from a system optimized for perfect CSI operation. 

Assumption 1: Transmitters use a typical Gaussian codebook made up of i.i.d. symbols to form
the symbol vectors $\mathbf{s}_i$. Such a signaling codebook, which was optimal for the interference free
channels created by IA with perfect CSI, may no longer be optimal now that CSI is imperfect.

Assumption 2: Receivers perform nearest neighbor decoding using the estimates $\hat{\mathbf{H}}_{i,i}$. Nearest
neighbor decoding would again be optimal with perfect CSI. The nearest neighbor decoder, the
channel estimates and the signaling codebook together satisfy the conditions outlined in [30] for
Corollary 3.0.1 of [30] to hold with equality, meaning that the estimation error plays the role of
an additional source of Gaussian noise irrespective of its actual distribution.

$^1$In fact, [7], [13], [32] place a stronger assumption, namely that the receivers know the exact imperfect CSI known
to the transmitters. The two assumptions are functionally equivalent from the perspective of the sum-rate analysis, i.e., all that
is needed is the receivers' knowledge of $\hat{\mathbf{w}}_i^m$ and of the scalars $(\hat{\mathbf{w}}_i^m)^*\hat{\mathbf{H}}_{i,i}\hat{\mathbf{f}}_i^m$.

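Under Assumptions 1 and 2, and combining the results of [30] and [34], the average sum-rate
achieved can be written as

$$\bar{R}_{\mathrm{sum}} = \mathbb{E}\left[\sum_{i=1}^{K} \sum_{m=1}^{d} \log_2\left(1 + \frac{\frac{P}{d}\left|(\hat{\mathbf{w}}_i^m)^*\hat{\mathbf{H}}_{i,i}\hat{\mathbf{f}}_i^m\right|^2}{\sum_{k,\ell} \mathbb{E}\left[\frac{P}{d}\left|(\hat{\mathbf{w}}_i^m)^*\tilde{\mathbf{H}}_{i,k}\hat{\mathbf{f}}_k^\ell\right|^2\right] + \sigma^2}\right)\right], \qquad (12)$$

where we note that the outer expectation is now only over the fading on the direct channel and
not the interference. Therefore, the leakage interference terms $(\hat{\mathbf{w}}_i^m)^*\tilde{\mathbf{H}}_{i,k}\hat{\mathbf{f}}_k^\ell$ indeed play the role
of independent sources of additive Gaussian noise, regardless of their distribution.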



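When the entries of $\tilde{\mathbf{H}}_{i,k}$ are zero-mean and uncorrelated with a variance of $\sigma_{\tilde{H}}^2$, it follows that
$\mathbb{E}\left[\frac{P}{d}\left|(\hat{\mathbf{w}}_i^m)^*\tilde{\mathbf{H}}_{i,k}\hat{\mathbf{f}}_k^\ell\right|^2\right] = \frac{P}{d}\sigma_{\tilde{H}}^2$; thus the denominator in (12) is simply $KP\sigma_{\tilde{H}}^2 + \sigma^2$. Moreover,
if the estimates $\hat{\mathbf{H}}_{i,k}$ are MMSE estimates of $\mathbf{H}_{i,k}$, the entries of $\hat{\mathbf{H}}_{i,k}$ have a variance of $1 - \sigma_{\tilde{H}}^2$.
This results in an effective average SINR that can be written as

$$\rho_{\mathrm{eff}} = \frac{\rho\,(1 - \sigma_{\tilde{H}}^2)}{\rho K d\, \sigma_{\tilde{H}}^2 + 1}, \qquad (13)$$

where $\rho$ is the per-stream SNR defined in (8). If the estimated direct channels are Gaussian,
the average sum-rate achieved by IA with imperfect CSI is again given in exponential integral
form as

$$\bar{R}_{\mathrm{sum}}(\rho_{\mathrm{eff}}) = Kd \log_2(e)\, e^{1/\rho_{\mathrm{eff}}} E_1\!\left(\frac{1}{\rho_{\mathrm{eff}}}\right). \qquad (14)$$

To evaluate the sum-rate achieved by IA, one must now characterize $\rho_{\mathrm{eff}}$ or equivalently $\sigma_{\tilde{H}}^2$. In
Section IV, we specialize our result for a system with training and analog CSI feedback and later
optimize IA's effective data rate with overhead in Section V.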
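To make the effect of CSI error in (13) concrete, here is a small sketch of the effective SINR and of the ceiling it approaches as the nominal SNR grows. The function names are hypothetical, and `csi_error_var` stands for the per-entry CSI error variance appearing in (13).

```python
def effective_sinr(rho: float, csi_error_var: float,
                   num_users: int, num_streams: int) -> float:
    """Effective per-stream SINR of (13) for a given CSI error variance in [0, 1)."""
    if not 0.0 <= csi_error_var < 1.0:
        raise ValueError("csi_error_var must lie in [0, 1)")
    leakage = rho * num_users * num_streams * csi_error_var
    return rho * (1.0 - csi_error_var) / (leakage + 1.0)


def sinr_ceiling(csi_error_var: float, num_users: int, num_streams: int) -> float:
    """Limit of (13) as rho grows without bound while the error variance stays fixed."""
    if not 0.0 < csi_error_var < 1.0:
        raise ValueError("csi_error_var must lie in (0, 1)")
    return (1.0 - csi_error_var) / (num_users * num_streams * csi_error_var)
```

With a fixed error variance the effective SINR saturates, which is why the error variance has to shrink with SNR (through longer or stronger training and feedback) for the IA gains to survive.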



IV. Training and Analog Feedback

We propose to split the acquisition of CSI at the transmitter into three main phases. First,
the transmitters train the forward channels via pilots. Second, the receivers train the feedback
channels via pilots, setting the stage for the forward transmitters to estimate the feedback
information in the next stage. Finally, the receivers feed back information about the forward
channels in an analog fashion, i.e., as unquantized complex symbols. We can characterize the
CSI error introduced in the CSI acquisition phase by examining the three stages.

A. Forward and Feedback Channel Training 

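In the first training phase, each transmitter $k$ sends an orthogonal pilot sequence matrix $\mathbf{\Phi}_k$,
i.e., $\mathbf{\Phi}_i\mathbf{\Phi}_k^* = \delta_{ik}\mathbf{I}_{N_T}$, over a training period $\tau_t$ [35]. Pilot orthogonality imposes the constraint
$\tau_t \ge KN_T$. Each receiver $i$ then observes the $N_R \times \tau_t$ matrix

$$\mathbf{Y}_i = \sqrt{\frac{\tau_t P}{N_T}} \sum_{k=1}^{K} \mathbf{H}_{i,k}\mathbf{\Phi}_k + \mathbf{V}_i, \qquad (15)$$

where $\mathbf{V}_i$ is an $N_R \times \tau_t$ matrix of noise terms. Using $\mathbf{Y}_i$, receiver $i$ calculates an MMSE estimate
of its incoming channels $\mathbf{H}_{i,k}$ $\forall k$ given by

$$\hat{\mathbf{H}}^r_{i,k} = \frac{\sqrt{\tau_t P/N_T}}{\sigma^2 + \tau_t P/N_T}\, \mathbf{Y}_i\mathbf{\Phi}_k^*, \qquad (16)$$

where the superscript $(\cdot)^r$ emphasizes that $\hat{\mathbf{H}}^r_{i,k}$ are the channel estimates gathered at the receiver
before they are relayed back to the transmitters and further corrupted. At the output of this first
training stage, the channel estimates $\hat{\mathbf{H}}^r_{i,k}$ have i.i.d. $\mathcal{CN}\left(0, \frac{\tau_t P/N_T}{\sigma^2 + \tau_t P/N_T}\right)$ entries with corresponding
errors $\tilde{\mathbf{H}}^r_{i,k} \sim \mathcal{CN}\left(0, \frac{\sigma^2}{\sigma^2 + \tau_t P/N_T}\right)$.

The feedback channel training phase proceeds similarly. Namely, the receivers transmit orthogonal
pilot sequences over a training period $\tau_p \ge KN_R$. The transmitters independently compute
MMSE estimates of their incoming channels, resulting in estimates $\hat{\mathbf{G}}_{k,i} \sim \mathcal{CN}\left(0, \frac{\tau_p P_F/N_R}{\sigma^2 + \tau_p P_F/N_R}\right)$
with corresponding error terms $\tilde{\mathbf{G}}_{k,i} \sim \mathcal{CN}\left(0, \frac{\sigma^2}{\sigma^2 + \tau_p P_F/N_R}\right)$.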
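The per-entry variances quoted for the two training phases follow one pattern, sketched below as a single helper. The function name and argument names are illustrative; the variances assume unit-variance channel entries and pilots whose energy per transmit antenna is the training length times the power divided by the number of pilot-sending antennas.

```python
def mmse_training_variances(train_len: int, power: float,
                            num_pilot_antennas: int, noise_var: float = 1.0) -> tuple:
    """Per-entry variances (estimate, error) of an MMSE channel estimate.

    Forward training:  train_len = tau_t, power = P,   num_pilot_antennas = N_T.
    Feedback training: train_len = tau_p, power = P_F, num_pilot_antennas = N_R.
    """
    pilot_snr = train_len * power / num_pilot_antennas
    estimate_var = pilot_snr / (noise_var + pilot_snr)
    error_var = noise_var / (noise_var + pilot_snr)
    return estimate_var, error_var
```

The two returned variances always sum to one, reflecting the orthogonality between an MMSE estimate and its error.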



B. Analog Feedback 

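After forward and feedback channel training, the receivers feed back their channel estimates
$\hat{\mathbf{H}}^r_{i,k}$ in an analog fashion during a feedback period $\tau_f$. This is accomplished by first post-multiplying
each $N_R \times KN_T$ feedback matrix $[\hat{\mathbf{H}}^r_{i,1}, \ldots, \hat{\mathbf{H}}^r_{i,K}]$ with a $KN_T \times \tau_f$ matrix $\mathbf{\Psi}_i$ such
that $\mathbf{\Psi}_i\mathbf{\Psi}_k^* = \delta_{ik}\mathbf{I}_{KN_T}$ [9], [15]. The spreading matrices orthogonalize the feedback from
different users and facilitate estimation. This orthogonality constraint requires that $\tau_f \ge K^2 N_T$.

The transmitted $N_R \times \tau_f$ feedback matrix from receiver $i$ can be written as [9], [13]

$$\mathbf{X}_i = \sqrt{\frac{\tau_f P_F}{KN_T N_R}\left(\frac{\tau_t P/N_T}{\sigma^2 + \tau_t P/N_T}\right)^{-1}}\; [\hat{\mathbf{H}}^r_{i,1}, \ldots, \hat{\mathbf{H}}^r_{i,K}]\,\mathbf{\Psi}_i, \qquad (17)$$

where the leading scalar ensures that the average transmit power constraint is satisfied
with equality, i.e., one can verify that $\mathbb{E}[\mathrm{trace}(\mathbf{X}_i\mathbf{X}_i^*)] = \tau_f P_F$. We write the concatenated
$KN_T \times \tau_f$ matrix of feedback symbols observed by all transmitters as

$$\mathbf{Y}_f = \sqrt{\frac{\tau_f P_F}{KN_T N_R}\left(\frac{\tau_t P/N_T}{\sigma^2 + \tau_t P/N_T}\right)^{-1}}\; \sum_{i=1}^{K} \begin{bmatrix} \mathbf{G}_{1,i} \\ \vdots \\ \mathbf{G}_{K,i} \end{bmatrix} [\hat{\mathbf{H}}^r_{i,1}, \ldots, \hat{\mathbf{H}}^r_{i,K}]\,\mathbf{\Psi}_i + \mathbf{V}, \qquad (18)$$

where $\mathbf{V}$ is the $KN_T \times \tau_f$ matrix of i.i.d. Gaussian noise.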

To simplify the performance analysis, we make the same assumption as in [9]: at the end of the
feedback phase, the transmitters cooperate by sharing their rows of the received feedback matrix
$\mathbf{Y}_f$, which enables them to form a unified estimate of the forward channels $\mathbf{H}_{i,k}$. We refer the
reader to [9] for a discussion of this cooperative assumption and for alternative non-cooperative
approaches that are shown to perform close to this special case.

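Under this cooperative assumption, the transmitters estimate $\mathbf{H}_{i,k}$ $\forall k$ by first isolating the
feedback sent by receiver $i$. They post-multiply their received symbols by $\mathbf{\Psi}_i^*$ to compute

$$\mathbf{Y}_f\mathbf{\Psi}_i^* = \sqrt{\frac{\tau_f P_F}{KN_T N_R}\left(\frac{\tau_t P/N_T}{\sigma^2 + \tau_t P/N_T}\right)^{-1}}\; \mathbf{G}_i\, [\hat{\mathbf{H}}^r_{i,1}, \ldots, \hat{\mathbf{H}}^r_{i,K}] + \mathbf{V}\mathbf{\Psi}_i^*, \qquad (19)$$

where $\mathbf{G}_i = [\mathbf{G}_{1,i}^T, \ldots, \mathbf{G}_{K,i}^T]^T$ stacks the feedback channels from receiver $i$ to all transmitters.

The transmitters then compute a common linear MMSE estimate of the forward channels $\forall i, k$
using their feedback channel estimates $\hat{\mathbf{G}}_{k,i}$ $\forall i, k$, and assuming that $KN_T > N_R$ so that the
estimation problem is well posed. After a lengthy yet standard application of the orthogonality
principle and the matrix inversion lemma, the MMSE estimate is given by

$$\hat{\mathbf{H}}_i = \sqrt{\frac{KN_T N_R}{\tau_f P_F}\left(\frac{\tau_t P/N_T}{\sigma^2 + \tau_t P/N_T}\right)}\; \left(\hat{\mathbf{G}}_i^*\hat{\mathbf{G}}_i + \gamma_1\hat{\mathbf{G}}_i^*\hat{\mathbf{G}}_i + \gamma_2\mathbf{I}\right)^{-1} \hat{\mathbf{G}}_i^*\,\mathbf{Y}_f\mathbf{\Psi}_i^*, \qquad (20)$$

where we have written (20) in terms of $\hat{\mathbf{H}}_i = [\hat{\mathbf{H}}_{i,1}, \ldots, \hat{\mathbf{H}}_{i,K}]$ $\forall i$, the concatenated estimate
of the channels $\mathbf{H}_i = [\mathbf{H}_{i,1}, \ldots, \mathbf{H}_{i,K}]$ $\forall i$, for the sake of notational brevity. The constants $\gamma_1$
and $\gamma_2$ are the MMSE regularization factors. For completeness, $\gamma_1$ and $\gamma_2$ are given by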

7l = , 72 = 1 + 7T n + -5-; r. /*r • ( 21 ) 



Pr t ' ,z V A nPv o 2 + r p P F /iV R 

In essence, $\gamma_1$ captures the effect of the noise in the transmitted estimates $\hat{\mathbf{H}}^r_{i,k}$, while $\gamma_2$ captures
the effect of the noise in the estimates $\hat{\mathbf{G}}_{k,i}$ as well as the noise observed during feedback.

Having formalized the three training and analog feedback stages, we now analyze the variance,$^2$
$\sigma_{\tilde{H}}^2$, of the CSI error $\mathbf{H}_{i,k} - \hat{\mathbf{H}}_{i,k}$, which automatically yields an estimated CSI variance of
$1 - \sigma_{\tilde{H}}^2$. Unfortunately, writing $\sigma_{\tilde{H}}^2$ exactly yields rather cumbersome expressions. For this reason,
we replace the variance of the MMSE estimation error by that of a zero-forcing estimator in
a manner similar to [16], [36]. This ZF simplification intuitively amounts to deriving a high
SNR result [16] and mathematically amounts to neglecting the constants $\gamma_1$ and $\gamma_2$; recall that
moderately high SNR is after all the main operating region of interest for IA. Numerical results
in Section VI will demonstrate that the effect of this simplification is negligible.

By neglecting $\gamma_1$ and $\gamma_2$, and after some algebraic manipulation, we find that the error
$\tilde{\mathbf{H}}_i = \mathbf{H}_i - \hat{\mathbf{H}}_i$ at the end of the three training and feedback phases can be written as

$$\tilde{\mathbf{H}}_i = \tilde{\mathbf{H}}^r_i + \left(\hat{\mathbf{G}}_i^*\hat{\mathbf{G}}_i\right)^{-1}\hat{\mathbf{G}}_i^*\left(\tilde{\mathbf{G}}_i\hat{\mathbf{H}}^r_i + \sqrt{\frac{KN_T N_R}{\tau_f P_F}}\,\mathbf{V}\mathbf{\Psi}_i^*\right). \qquad (22)$$

As can be seen from (22), the resulting CSI error is a combination of three terms: the first due
to forward channel estimation error $\tilde{\mathbf{H}}^r_i$, the second due to feedback channel estimation error
$\tilde{\mathbf{G}}_i$, and the third due to feedback noise.

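To derive the statistics of $\tilde{\mathbf{H}}_i$, we note the following three facts about the three terms in (22):
1) The entries of $\tilde{\mathbf{H}}^r_i$ are uncorrelated $\mathcal{CN}\left(0, \frac{N_T\sigma^2}{\tau_t P}\right)$ variables as shown in Section IV-A.
2) Similarly, the entries of $\tilde{\mathbf{G}}_i$ are $\mathcal{CN}\left(0, \frac{\sigma^2}{\sigma^2 + \tau_p P_F/N_R}\right)$, implying that $\tilde{\mathbf{G}}_i\hat{\mathbf{H}}^r_i$ has independent
entries with variance equal to $\frac{N_R\sigma^2}{\sigma^2 + \tau_p P_F/N_R}$.
3) The entries of $\mathbf{V}$ are uncorrelated $\mathcal{CN}(0, \sigma^2)$ variables and so are the elements of $\mathbf{V}\mathbf{\Psi}_i^*$
since the matrix $\mathbf{\Psi}_i$ is unitary.

$^2$We in fact derive the entire covariance matrix for the columns of $\hat{\mathbf{H}}_{i,k} - \mathbf{H}_{i,k}$. We show that the covariance matrices are
scaled identities and thus the second order statistics of the error are entirely described by a scalar variance.

Combining the properties stated, the covariance of each column of $\tilde{\mathbf{H}}_i$, denoted $\tilde{\mathbf{h}}_i$,
conditioned on $\hat{\mathbf{G}}_i$, is [9], [13]

$$\mathbb{E}\left[\tilde{\mathbf{h}}_i\tilde{\mathbf{h}}_i^* \,\middle|\, \hat{\mathbf{G}}_i\right] = \frac{N_T\sigma^2}{\tau_t P}\,\mathbf{I}_{N_R} + \left(\frac{N_R\sigma^2}{\sigma^2 + \tau_p P_F/N_R} + \frac{KN_T N_R\sigma^2}{\tau_f P_F}\right)\left(\hat{\mathbf{G}}_i^*\hat{\mathbf{G}}_i\right)^{-1}. \qquad (23)$$

Since the entries of the MMSE estimate $\hat{\mathbf{G}}_i$ are Gaussian, the matrix $\left(\hat{\mathbf{G}}_i^*\hat{\mathbf{G}}_i\right)^{-1}$ has an inverse-Wishart
distribution [37]. Moreover, since $\hat{\mathbf{G}}_i$ has uncorrelated entries with a variance of
$\frac{\tau_p P_F/N_R}{\sigma^2 + \tau_p P_F/N_R}$, $\left(\hat{\mathbf{G}}_i^*\hat{\mathbf{G}}_i\right)^{-1}$ has a covariance matrix equal to a properly scaled identity [9], [15], [37]. Thus
marginalizing (23) over $\hat{\mathbf{G}}_i$, we find that $\tilde{\mathbf{H}}_i$ has independent columns with scaled identity
covariance matrices with diagonal entries given by

$$\sigma_{\tilde{H}}^2 = \frac{N_T\sigma^2}{\tau_t P} + \frac{\sigma^2}{(KN_T - N_R)P_F}\left(\frac{N_R^2}{\tau_p} + \frac{KN_T N_R}{\tau_f}\left(1 + \frac{N_R\sigma^2}{\tau_p P_F}\right)\right). \qquad (24)$$
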

The same high SNR simplification adopted earlier to replace MMSE estimation error by ZF
estimation error, however, allows us to further simplify (24) by writing

$$\sigma_{\tilde{H}}^2 = \frac{N_T\sigma^2}{\tau_t P} + \frac{\sigma^2}{P(KN_T - N_R)}\left(\frac{N_R^2}{\gamma\tau_p} + \frac{KN_T N_R}{\gamma\tau_f}\right), \qquad (25)$$

which completes the characterization of the distortion introduced by training and analog feedback.
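The simplified distortion in (25) is straightforward to evaluate numerically. The sketch below is one way to do so; the function and argument names are hypothetical, and the check that $KN_T > N_R$ mirrors the well-posedness condition of the feedback estimation problem.

```python
def csi_error_variance(tau_t: float, tau_p: float, tau_f: float, power: float,
                       gamma: float, num_users: int, n_t: int, n_r: int,
                       noise_var: float = 1.0) -> float:
    """High-SNR CSI error variance of (25) for given training and feedback lengths.

    gamma is the ratio of feedback power to forward power, P_F / P.
    """
    spare = num_users * n_t - n_r
    if spare <= 0:
        raise ValueError("requires K * N_T > N_R")
    forward_term = n_t * noise_var / (tau_t * power)
    feedback_term = (noise_var / (power * spare)) * (
        n_r ** 2 / (gamma * tau_p) + num_users * n_t * n_r / (gamma * tau_f)
    )
    return forward_term + feedback_term
```

As an example, with three users, two antennas per node, unit noise variance, $P = 100$, $\gamma = 1$ and $(\tau_t, \tau_p, \tau_f) = (10, 10, 40)$, the two terms evaluate to 0.002 and 0.00175 respectively.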

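Note: Finally, a word on applying the results of Section III to the analog feedback system
described. First, we note that the analog feedback system satisfies Assumption 2 and the
estimates yield $\mathbb{E}\left[\frac{P}{d}\left|(\hat{\mathbf{w}}_i^m)^*\tilde{\mathbf{H}}_{i,k}\hat{\mathbf{f}}_k^\ell\right|^2\right] = \frac{P}{d}\sigma_{\tilde{H}}^2$ and $\mathbb{E}\left[\frac{P}{d}\left|(\hat{\mathbf{w}}_i^m)^*\hat{\mathbf{H}}_{i,i}\hat{\mathbf{f}}_i^m\right|^2\right] = \frac{P}{d}(1 - \sigma_{\tilde{H}}^2)$ as
needed. One subtlety though is that the fading on the feedback channel introduces non-Gaussian
terms into the estimates $\hat{\mathbf{H}}_{i,k}$, yet (14) is only exact when the estimates are truly Gaussian. For
fairly accurate estimation, however, $\hat{\mathbf{H}}_{i,i}$ can be well approximated by a Gaussian. Moreover, it
will be clear from the results of Section VI that the effect of this is negligible.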



V. Optimizing Overhead and Effective Sum-Rate 

Having formally quantified IA sum-rate as a function of SNR and CSI quality, and characterized
CSI quality in terms of training and feedback resources, we redefine both the optimization
problem and objective function as

$$R^*_{\mathrm{eff}}(P) = \max_{\tau_t, \tau_p, \tau_f} \left(\frac{T_{\mathrm{frame}} - (\tau_t + \tau_p + \tau_f)}{T_{\mathrm{frame}}}\right) \bar{R}_{\mathrm{sum}}(\rho_{\mathrm{eff}}), \qquad (26)$$

where we have used $(\cdot)^*$ to denote optimality. We note from (26) that $\rho_{\mathrm{eff}}$ depends on $\sigma_{\tilde{H}}^2$ and
thus on $\tau_t$, $\tau_p$, and $\tau_f$. The problem in (26) can be rewritten in a more tractable form as [19]

$$R^*_{\mathrm{eff}}(P) = \max_{\alpha_{\min} \le \alpha \le 1} (1 - \alpha) \max_{\substack{\tau_t, \tau_p, \tau_f \\ \tau_t + \tau_p + \tau_f = \alpha T_{\mathrm{frame}}}} \bar{R}_{\mathrm{sum}}(\rho_{\mathrm{eff}}), \qquad (27)$$

where $\alpha_{\min} = K(N_T + N_R + KN_T)/T_{\mathrm{frame}}$ is dictated by the minimum number of training
and feedback symbols needed to render the estimation problems in Section IV well defined. The
inner maximization in (27) optimizes sum-rate for a fixed overhead length of $T_{\mathrm{ohd}} = \alpha T_{\mathrm{frame}}$
and the outer maximization finds the optimal $\alpha$, thereby completing the solution.
Since -R sum (p c fj) is decreasing in cr~, the inner maximization step simplifies to 

2 * . N T a 2 a 2 fKN T N R N R 2 \ 

a~ = mm 1 ; 1 

n, T P , T f Tt P P(KN T - N R ) V 7Tf ir p J (2g) 

s.t. r t + r p + r f = aT frame . 

Although (|28T) is an integer problem, its continuous relaxation is convex. Applying standard 
convex optimization techniques, the Lagrangian for the inner maximization is 

^ N T a 2 a 2 /KN T N R N R 2 \ w 
A r t , t p , r f , A = + — — — ^ + -5- + A r t + r p + n - aT frame . (29) 

r t p p(knt - Nr) v m jt p j 

Solving for the first order KKT conditions, we obtain the optimal training and feedback times 
as a function of the total overhead budget aTf ramc as 

y/7N T (KN T -N R ) *_^ T * - V KN ? N * T 

T~t Oil frame; 7~p Oil frame; Tf Oil frame) 

fl [I [I 

where /x = ^^N^iKN^ — N R ) + N R + y/KN T N R . After solving the problem's continuous re- 
laxation, convexity implies that for any given feasible overhead budget aTf ramc simply examining 
the few integer neighbors of the points r t *, r p *, rf* yields the integer training and feedback times 
that minimize CSI distortion, i.e., optimal integer training and feedback times can be found by a 
simple search over the grid neighbors of the non-integer solution. Proceeding with the continuous 
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relaxation, the minimum CSI distortion for an overhead budget aT frame is 

$$
\tilde{\sigma}^{2*} = \frac{\sigma^2 \left( \sqrt{K N_T N_R} + N_R + \sqrt{\gamma N_T (KN_T - N_R)} \right)^2}{\gamma P (KN_T - N_R)\, \alpha T_{\mathrm{frame}}}. \tag{30}
$$
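The allocation step can be illustrated numerically. The following is a minimal sketch, assuming the objective in (28) with the feedback terms both scaled by $1/\gamma$ and with `snr` standing for $P/\sigma^2$; the function names and the example parameters are hypothetical and chosen only for illustration. It computes the relaxed optimum, the closed-form minimum in (30), and an integer split found by checking the floor/ceiling neighbors of the relaxed solution.

```python
import math


def _weights(K, N_T, N_R, gamma):
    """Numerators of tau_t*, tau_p*, tau_f*; their sum is mu."""
    return (math.sqrt(gamma * N_T * (K * N_T - N_R)),
            float(N_R),
            math.sqrt(K * N_T * N_R))


def optimal_split(budget, K, N_T, N_R, gamma):
    """Relaxed optimum (tau_t, tau_p, tau_f) for a total overhead `budget`."""
    w = _weights(K, N_T, N_R, gamma)
    mu = sum(w)
    return tuple(x / mu * budget for x in w)


def distortion(tau_t, tau_p, tau_f, K, N_T, N_R, gamma, snr):
    """Objective of (28); snr is an assumed shorthand for P / sigma^2."""
    feedback = (K * N_T * N_R / tau_f + N_R ** 2 / tau_p) / gamma
    return N_T / (tau_t * snr) + feedback / (snr * (K * N_T - N_R))


def min_distortion(budget, K, N_T, N_R, gamma, snr):
    """Closed-form minimum (30) of the relaxed problem."""
    mu = sum(_weights(K, N_T, N_R, gamma))
    return mu ** 2 / (gamma * snr * (K * N_T - N_R) * budget)


def integer_split(budget, K, N_T, N_R, gamma, snr):
    """Check floor/ceil neighbours of the relaxed optimum; tau_f takes the remainder."""
    t_star, p_star, _ = optimal_split(budget, K, N_T, N_R, gamma)
    best = None
    for t in (math.floor(t_star), math.ceil(t_star)):
        for p in (math.floor(p_star), math.ceil(p_star)):
            f = budget - t - p
            if min(t, p, f) < 1:
                continue  # every phase needs at least one symbol
            cost = distortion(t, p, f, K, N_T, N_R, gamma, snr)
            if best is None or cost < best[0]:
                best = (cost, (t, p, f))
    return best
```

For a three-user cluster with two antennas per node, $\gamma = 1$, and a 30-symbol budget, `integer_split(30, 3, 2, 2, 1.0, 1.0)` selects the split (10, 7, 13), which is slightly above the relaxed bound given by `min_distortion`.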

Having found the optimal allocation of $\tau_t$, $\tau_p$, and $\tau_f$ for a fixed overhead budget, what
remains is to optimize the budget itself. The outer optimization in (26), however, does not admit
a closed-form solution. To circumvent this problem, prior work on single-user and broadcast
channels has specialized its results to the limiting high or low SNR regimes [18], [20], relied
on numerical optimization [0], or resorted to characterizing the scaling of overhead with various
system parameters based on sum-rate lower bounds [19]. To give accurate results on finite-SNR
sum-rate, we propose to optimize a series expansion of (27) with respect to the channel's Doppler
spread around the point $f_D = 0$ [17]. Recall that $T_{\mathrm{frame}}$, which we have been using thus far, is
related to $f_D$ by the relationship $T_{\mathrm{frame}} = \frac{1}{2 f_D}$. To that end, we give the following result on the
series expansion of $\bar{R}_{\mathrm{eff}}(P, T_{\mathrm{ohd}})$.

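Proposition 1: The effective sum-rate achieved by IA with training and feedback expands as

$$
\bar{R}_{\mathrm{eff}}(P, T_{\mathrm{ohd}}) = (1-\alpha)(1+\rho K d) \left[ \frac{\bar{R}_{\mathrm{sum}}(\rho)}{1+\rho K d} - \frac{2\beta}{\alpha} \dot{R}_{\mathrm{sum}}(\rho) f_D + \frac{1}{2} \left( \frac{2\beta}{\alpha} \right)^2 \left( \ddot{R}_{\mathrm{sum}}(\rho)(1+\rho K d) + 2 K d \dot{R}_{\mathrm{sum}}(\rho) \right) f_D^2 \right] + O(f_D^3), \tag{31}
$$

where

$$
\beta = \frac{\left( \sqrt{K N_T N_R} + N_R + \sqrt{\gamma N_T (KN_T - N_R)} \right)^2}{\gamma (KN_T - N_R)}, \tag{32}
$$

whereas $\dot{R}_{\mathrm{sum}}(\rho)$ and $\ddot{R}_{\mathrm{sum}}(\rho)$ are the first and second derivatives of the perfect-CSI sum-rate,
$\bar{R}_{\mathrm{sum}}(\rho)$, which can be conveniently expressed as

$$
\dot{R}_{\mathrm{sum}}(\rho) = \frac{1}{\rho} \left( K d \log_2(e) - \frac{\bar{R}_{\mathrm{sum}}(\rho)}{\rho} \right), \qquad
\ddot{R}_{\mathrm{sum}}(\rho) = -\frac{1}{\rho^2} \left( K d \log_2(e) + \dot{R}_{\mathrm{sum}}(\rho) - \frac{2 \bar{R}_{\mathrm{sum}}(\rho)}{\rho} \right). \tag{33}
$$

Proof: Given in Appendix A. ■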
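The derivative expressions in (33) can be sanity-checked numerically. The sketch below assumes that the perfect-CSI sum-rate referenced as (8) takes the exponential-integral form $\bar{R}_{\mathrm{sum}}(\rho) = K d \log_2(e)\, e^{1/\rho} E_1(1/\rho)$, which is consistent with (33) but is not restated in this section. The helper `exp1` is a plain standard-library implementation written for this example (power series for small arguments, continued fraction otherwise); all function names are hypothetical.

```python
import math

EULER_GAMMA = 0.5772156649015329
LOG2E = math.log2(math.e)


def exp1(x):
    """Exponential integral E1(x) for x > 0."""
    if x <= 0:
        raise ValueError("x must be positive")
    if x <= 1.0:
        # Power series: E1(x) = -gamma - ln(x) - sum_{k>=1} (-x)^k / (k * k!)
        total = -EULER_GAMMA - math.log(x)
        term = 1.0
        for k in range(1, 40):
            term *= -x / k          # term = (-x)^k / k!
            total -= term / k
        return total
    # Continued fraction evaluated with the modified Lentz method.
    b = x + 1.0
    c = 1e300
    d = 1.0 / b
    h = d
    for i in range(1, 200):
        a = -float(i * i)
        b += 2.0
        d = 1.0 / (a * d + b)
        c = b + a / c
        delta = c * d
        h *= delta
        if abs(delta - 1.0) < 1e-14:
            break
    return h * math.exp(-x)


def rate_sum(rho, K, d):
    """Assumed perfect-CSI sum-rate: K d log2(e) exp(1/rho) E1(1/rho)."""
    return K * d * LOG2E * math.exp(1.0 / rho) * exp1(1.0 / rho)


def rate_sum_d1(rho, K, d):
    """First derivative, first expression in (33)."""
    return (K * d * LOG2E - rate_sum(rho, K, d) / rho) / rho


def rate_sum_d2(rho, K, d):
    """Second derivative, second expression in (33)."""
    r = rate_sum(rho, K, d)
    return -(K * d * LOG2E + rate_sum_d1(rho, K, d) - 2.0 * r / rho) / rho ** 2
```

Comparing `rate_sum_d1` and `rate_sum_d2` against central finite differences of `rate_sum` at a few SNR values is a quick way to confirm that the closed forms and the assumed rate expression agree.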
Thus, by expanding effective sum-rate w.r.t. $f_D$, we have transformed the complicated non-linear
dependence of effective sum-rate on system parameters such as $P$, $T_{\mathrm{frame}}$, $f_D$, and $T_{\mathrm{ohd}}$ to
a simpler polynomial dependence. The expansion in Proposition 1 can now be used to derive the
expansion of the optimal overhead budget, $\alpha^*$, along with the performance it achieves. Relaxing
the constraint that the overhead fraction $\alpha$ must be rational, simply differentiating the series
expansion in Proposition 1 and equating it to zero yields the optimal overhead budget $\alpha^*$.

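Proposition 2: The optimum overhead fraction $\alpha^*$ for an IA system with training and analog
feedback expands as

$$
\alpha^* = \sqrt{\frac{2\beta (1+\rho K d) \dot{R}_{\mathrm{sum}}(\rho)}{\bar{R}_{\mathrm{sum}}(\rho)}\, f_D} - \beta \left( \frac{\ddot{R}_{\mathrm{sum}}(\rho)}{\dot{R}_{\mathrm{sum}}(\rho)} (1+\rho K d) + 2 K d \right) f_D + O(f_D^{3/2}), \tag{34}
$$

which results in the optimal effective sum-rate

$$
\bar{R}^{*}_{\mathrm{eff}}(P) = \bar{R}_{\mathrm{sum}}(\rho) - 2 \sqrt{2\beta (1+\rho K d) \dot{R}_{\mathrm{sum}}(\rho) \bar{R}_{\mathrm{sum}}(\rho) f_D} + O(f_D). \tag{35}
$$

Note that if $f_D$ is large enough that $\alpha^* < \alpha_{\min}$, the optimal overhead budget must be adjusted to
$\alpha_{\min}$ and the expression for $\bar{R}^{*}_{\mathrm{eff}}(P)$ correspondingly updated.

Proof: The proof follows directly from differentiating the expansion in Proposition 1 w.r.t.
$\alpha$ and solving the resulting cubic polynomial for its relevant root. ■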
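The differentiation step in the proof can be spelled out as follows. This is a sketch of one way to carry out the calculation, using the shorthand $R$, $a$, $b$ and the trial expansion coefficients $\alpha_1$, $\alpha_2$, all of which are introduced here for brevity and are not part of the original notation.

```latex
% Shorthand introduced for this sketch only:
%   R = \bar{R}_{sum}(\rho)
%   a = 2\beta (1+\rho K d) \dot{R}_{sum}(\rho)
%   b = 2\beta^2 (1+\rho K d) ( \ddot{R}_{sum}(\rho)(1+\rho K d) + 2 K d \dot{R}_{sum}(\rho) )
\begin{align*}
\bar{R}_{\mathrm{eff}} &\approx (1-\alpha)\Big( R - \frac{a f_D}{\alpha} + \frac{b f_D^2}{\alpha^2} \Big), \\
\alpha^3 \frac{\partial \bar{R}_{\mathrm{eff}}}{\partial \alpha}
  &= -R\alpha^3 + (a f_D + b f_D^2)\,\alpha - 2 b f_D^2 = 0, \\
\alpha = \alpha_1 \sqrt{f_D} + \alpha_2 f_D + \dots
  \;&\Rightarrow\;
  \begin{cases}
    f_D^{3/2}: & -R\alpha_1^3 + a\alpha_1 = 0 \;\Rightarrow\; \alpha_1 = \sqrt{a/R}, \\
    f_D^{2}:   & -3R\alpha_1^2\alpha_2 + a\alpha_2 - 2b = 0 \;\Rightarrow\; \alpha_2 = -b/a,
  \end{cases} \\
\bar{R}^{*}_{\mathrm{eff}} &= R - \Big( R\alpha_1 + \frac{a}{\alpha_1} \Big)\sqrt{f_D} + O(f_D)
  = R - 2\sqrt{a R f_D} + O(f_D).
\end{align*}
```

Substituting $a$ and $b$ back gives a $\sqrt{f_D}$ coefficient of $\sqrt{2\beta(1+\rho K d)\dot{R}_{\mathrm{sum}}(\rho)/\bar{R}_{\mathrm{sum}}(\rho)}$ and an $f_D$ coefficient of $-\beta\big(\ddot{R}_{\mathrm{sum}}(\rho)(1+\rho K d)/\dot{R}_{\mathrm{sum}}(\rho) + 2Kd\big)$ for the optimal overhead fraction, together with the $\sqrt{f_D}$ rate penalty stated in the proposition.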
Therefore, Proposition 2 along with the solution to (28) gives the effective sum-rate-maximizing
amount of forward training, feedback channel training, and analog feedback as simple functions
of fundamental system parameters such as SNR, Doppler spread (equivalently $T_{\mathrm{frame}}$), and
perfect-CSI sum-rate. Numerical results in Section VI will show that the overhead expansion in
Proposition 2 is accurate for a wide range of system parameters and can thus obviate the need
for numerical overhead optimization. Furthermore, the derived results allow us to draw several
interesting insights into IA system design and performance:

1) The optimal overhead budget $\alpha^*$ scales with $\sqrt{f_D}$. As stated, for high enough Doppler,
$\alpha^*$ must be adjusted to $\alpha_{\min}$, meaning that overhead subsequently increases with $f_D$. This
scaling behavior is in line with previous results on other single and multiuser channels.

2) The sum-rate penalty due to overhead and imperfect CSI behaves similarly, i.e., it increases
with $\sqrt{f_D}$ initially and with $f_D$ at high Doppler.

3) Examining the leading term in $\alpha^*$ we note that, similarly to [17], the term $(1+\rho K d)\,\dot{R}_{\mathrm{sum}}(\rho)/\bar{R}_{\mathrm{sum}}(\rho)$
behaves like $K d / \log_e(1+\rho)$ and thus the optimal overhead budget decreases with SNR
roughly as $\sqrt{K d / \log_e(1+\rho)}$.

4) Since overhead decreases with SNR, a minimum overhead interval of $K N_T + K N_R + K^2 N_T$
is always optimal at sufficiently high SNR. Thus, the effective number of spatial DoF
achieved by IA with the analog feedback strategy described is $\left( 1 - \frac{K N_T + K N_R + K^2 N_T}{T_{\mathrm{frame}}} \right) K d$,
i.e., the DoF penalty increases linearly with $f_D$.

5) Again examining the leading term in $\alpha^*$ we note that it increases with $\sqrt{\beta}$. Recalling
the definition of $\beta$ in (32), we conclude that the optimal overhead budget increases with
$\sqrt{P/P_f}$. This formalizes the relationship between overhead and feedback link quality.
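The high-SNR behavior quoted in the third observation can be checked with a short calculation. The sketch below assumes the usual leading-order high-SNR approximation $\bar{R}_{\mathrm{sum}}(\rho) \approx K d \log_2 \rho$, which is an assumption made here for illustration, and uses the first-derivative expression from Proposition 1.

```latex
\begin{align*}
\dot{R}_{\mathrm{sum}}(\rho)
  &= \frac{1}{\rho}\Big( K d \log_2 e - \frac{\bar{R}_{\mathrm{sum}}(\rho)}{\rho} \Big)
  \approx \frac{K d \log_2 e}{\rho}
  \qquad (\rho \to \infty), \\
(1+\rho K d)\,\frac{\dot{R}_{\mathrm{sum}}(\rho)}{\bar{R}_{\mathrm{sum}}(\rho)}
  &\approx \rho K d \cdot \frac{K d \log_2 e / \rho}{K d \log_2 \rho}
  = \frac{K d}{\log_e \rho}
  \approx \frac{K d}{\log_e(1+\rho)}.
\end{align*}
```

Under this approximation the leading term of the optimal overhead fraction behaves as $\sqrt{2\beta K d f_D / \log_e(1+\rho)}$, so the overhead budget shrinks only logarithmically as SNR grows.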
In addition to highlighting the dependence of overhead and effective sum-rate on various system 
parameters, the derived results can provide simple answers to various network design questions. 
For example, by simply comparing IA's effective sum-rate expression to those achieved by 
other transmission strategies, one can choose the optimal transmission strategy for a given 
fading environment. Moreover, since overhead and channel selectivity have been shown to place 
fundamental limits on the gains of cooperation in wireless networks [38], the overhead-aware
analysis presented in this paper can help in determining the optimal number of cooperative IA 
users at a given level of selectivity. 

Consider, as a simple example, a K-user single-stream cooperation cluster with a variable
number of antennas in which extra users are allowed to cooperate via IA if they do not incur a loss
in effective sum-rate; otherwise, the extra users are not allowed access to the propagation medium and
are presumably left to transmit on a separate channel. In this model, additional cooperating users can
be incorporated into the cluster as long as $\bar{R}^{*}_{K+1}(P) - \bar{R}^{*}_{K}(P) > 0$, where we have made cluster
size explicit in the effective sum-rate subscript. Consequently, the effective sum-rate-maximizing
cluster size becomes the smallest $K$ such that $\bar{R}^{*}_{K+1}(P) - \bar{R}^{*}_{K}(P) < 0$. Moreover, note that
minimizing overhead and maintaining IA feasibility imposes the constraint $N_T + N_R = K + 1$ [23].
Thus, writing $N_T$ and $N_R$ in terms of $K$, e.g. $N_T = \lceil (K+1)/2 \rceil$, the user admission rule can
be simplified to a function of only $K$, $f_D$, SNR, and $\gamma$. To simplify the user admission rule even
further, we make the following approximations: (i) we consider the leading term in $\bar{R}_{\mathrm{eff}}(P, T_{\mathrm{ohd}})$,
thus focusing on IA's effective DoF given in the fourth observation after Proposition 2; (ii) we
assume that $N_T = (K+1)/2$ and thus relax its integer constraint. Using these simplifications,
the user admission rule $\bar{R}^{*}_{K+1}(P) - \bar{R}^{*}_{K}(P) > 0$ simplifies to

$$
4K^3 + 15K^2 + 17K + 6 < \frac{1}{f_D}, \tag{36}
$$
i.e., a K-user cluster can be extended to K + 1 as long as (36) is satisfied. Interestingly, this
implies that in such single-stream IA scenarios the effective sum-rate-maximizing cluster size
grows with $f_D^{-1/3}$. While the approximate admission rule is a rather simplified version of
$\bar{R}^{*}_{K+1}(P) - \bar{R}^{*}_{K}(P) > 0$, we show in Section VI that it is very accurate at predicting optimal
cluster size. Finally, we note that while we provide this example to illustrate problems that can
be solved using our analysis, the rule in (36) is by no means universal. When parameters such
as large-scale fading or uncoordinated interference are considered, both the analysis and the
admission rule must be adjusted.
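The admission rule lends itself to a short numerical illustration. The sketch below is an assumption-laden example rather than the procedure used for the paper's figures: it takes $d = 1$, $N_T = N_R = (K+1)/2$ (integer constraint relaxed), $T_{\mathrm{frame}} = 1/(2 f_D)$, and the minimum overhead $K N_T + K N_R + K^2 N_T$; the function names and the starting cluster size `k_start` are hypothetical. It grows the cluster with the polynomial rule and compares the result with a direct search over the effective DoF.

```python
def overhead_symbols(K):
    """Minimum overhead K*N_T + K*N_R + K^2*N_T with N_T = N_R = (K + 1) / 2."""
    n = (K + 1) / 2
    return K * n + K * n + K * K * n


def effective_dof(K, f_D, d=1):
    """Effective DoF (1 - overhead / T_frame) * K * d with T_frame = 1 / (2 f_D)."""
    t_frame = 1.0 / (2.0 * f_D)
    return (1.0 - overhead_symbols(K) / t_frame) * K * d


def admit_next_user(K, f_D):
    """Polynomial admission rule: extend a K-user cluster to K + 1 users."""
    return 4 * K ** 3 + 15 * K ** 2 + 17 * K + 6 < 1.0 / f_D


def cluster_size_by_rule(f_D, k_start=1):
    """Keep admitting users while the rule holds."""
    K = k_start
    while admit_next_user(K, f_D):
        K += 1
    return K


def cluster_size_by_search(f_D, k_max=200):
    """Exhaustive search for the DoF-maximizing cluster size."""
    return max(range(1, k_max + 1), key=lambda K: effective_dof(K, f_D))
```

With $f_D = 10^{-3}$ both functions return a cluster of 6 users, and with $f_D = 10^{-4}$ both return 13, in line with the roughly cube-root growth of cluster size with coherence time.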

VI. Simulation Results 

Consider a three-user IA cluster with two transmit antennas, two receive antennas, and one
spatial stream per user and let $\gamma = P_f/P = 1$. Fig. 3 shows the effective sum-rate achieved by
IA in systems with various levels of mobility or normalized Doppler spreads, $f_D$. To quantify
the degradation in effective sum-rate caused by overhead and imperfect CSI, we include the
performance of a baseline genie-aided system in which CSI is both perfect and free. Fig. 3
indicates that IA achieves good performance in a system with vehicular levels of mobility. In fact,
if typical wireless parameters are adopted, such as a wavelength of $\lambda = 0.15$ m (corresponding
to a carrier frequency of 2 GHz), a coherence bandwidth of $W_c = 300$ kHz, and a normalized
Doppler given by $f_D = \frac{v}{\lambda W_c}$, where $v$ is the user's velocity, Fig. 3 indicates that IA could
theoretically perform well even at a speed of more than 160 km/hr. The rate of performance
degradation over a wider range of Doppler spread can be seen in Fig. 4. Both Figs. 3 and 4
indicate that the analytical results of Section V are very effective in optimizing the effective
sum-rate of IA systems, as the resulting performance closely matches that of a numerically optimized
system. Finally, Fig. 3 indicates that the effect of the simplifying assumptions made in Section
IV is negligible since the effective sum-rate predicted by the derived rate expressions closely
matches simulated IA performance. A very slight deviation is noticed at very low SNR, where
the ZF simplification in Section IV is a less accurate approximation of MMSE performance.
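The mapping from vehicle speed to the normalized quantities used in these figures is simple arithmetic. The sketch below assumes that normalized Doppler is the Doppler frequency $v/\lambda$ divided by the coherence bandwidth $W_c$, and that $T_{\mathrm{frame}} = 1/(2 f_D)$; the function names and default arguments are hypothetical and reuse the example parameters quoted above.

```python
def normalized_doppler(speed_kmh, wavelength_m=0.15, coherence_bw_hz=300e3):
    """Assumed definition: f_D = v / (lambda * W_c), with v converted to m/s."""
    speed_ms = speed_kmh / 3.6
    return speed_ms / (wavelength_m * coherence_bw_hz)


def frame_length(f_D):
    """Frame length in symbols, assuming T_frame = 1 / (2 f_D)."""
    return 1.0 / (2.0 * f_D)
```

Under these assumptions, a speed of 160 km/hr corresponds to a normalized Doppler of roughly $9.9 \times 10^{-4}$ and a frame of about 506 symbols.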

Fig. 5 shows the optimal overhead budget for systems with varying frame lengths and again
includes both the analytical overhead budget from Section V as well as the result of numerically
optimizing the same system. Fig. 5 confirms that $T_{\mathrm{ohd}}$ increases with frame size $T_{\mathrm{frame}}$ at a
rate proportional to $\sqrt{T_{\mathrm{frame}}}$. Thus $\alpha^*$ indeed decreases as $1/\sqrt{T_{\mathrm{frame}}}$, as shown in Fig. 6, or
equivalently increases with $\sqrt{f_D}$ (and with $f_D$ for sufficiently high Doppler). Fig. 5 also shows
that the expansion in Proposition 2 provides an accurate characterization of IA's effective
sum-rate-maximizing overhead budget over a wide range of SNRs and frame sizes. Fig. 7 in turn
verifies the decrease of $\alpha^*$ with SNR, which as stated in Section V follows the relationship
$\alpha^* \sim \sqrt{\log_e(1+\rho)^{-1}}$. To complete the characterization of overhead and effective sum-rate, Fig.
8 quantifies the deleterious effect of a weak feedback channel on overhead and effective sum-rate.
Fig. 8 also indicates that the expansion results of Section V could significantly underestimate
$\alpha^*$ in very weak feedback channels, though the final effect on throughput remains limited.

Finally, we examine the usefulness of our overhead analysis in further network design. We
consider the motivating example given in Section V of a K-user system for which we seek to
optimize the cooperation cluster size as a function of mobility. Fig. 9 shows the optimal cluster
size as a function of $T_{\mathrm{frame}}$ for an IA system at 35 dB SNR. Fig. 10 shows the corresponding
effective sum-rate achieved. We plot the cluster size and effective sum-rate resulting from (i) an
exhaustive search over all possible $K$, and (ii) the simple overhead-based user admission rule
in (36). We note that the cluster sizes predicted by the two methods are in close agreement, and
that the asymptotic cube-root relationship predicted in Section V between optimal cluster size
and $T_{\mathrm{frame}}$ is quite accurate even for small values of $K$. While the overhead-based rule tends to
underestimate cluster size for small intervals of $T_{\mathrm{frame}}$, Fig. 10 indicates that the resulting rate
gap from optimal sizing is negligible. The same can be said about the rate loss when applying
the same overhead-based rule to a system at an SNR of 10 dB and a system with $\gamma = 10^{-2}$.

VII. Conclusion 

We considered IA's effective sum-rate in practical systems where CSI is imperfect and comes 
with an associated overhead cost. We showed that training and feedback overhead can be 
optimized to ensure good IA performance over a wide range of SNR and Doppler spread. We 
quantified the dependence between overhead and various system parameters such as feedback link 
quality. More sophisticated precoding algorithms, designed to be robust to imperfect CSI, could 
further improve the demonstrated performance and thus remain a promising area for future work. 
The derived results provide a formal method to gauge true IA performance vs. other transmission 
strategies, and can thus highlight settings under which IA provides tangible gains. The derived 
analysis can also be used for further network design as demonstrated by the motivating example 
given at the end of Section V on overhead-aware user admission and optimal network sizing.



Appendix A
Proof of Proposition 1
To expand effective sum-rate around $f_D = 0$, we start by computing its first-order derivative



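$$
\left. \frac{\partial \bar{R}_{\mathrm{eff}}(P, T_{\mathrm{ohd}})}{\partial f_D} \right|_{f_D = 0}
= (1-\alpha)\, \dot{R}_{\mathrm{sum}}(\rho) \left. \frac{\partial \rho_{\mathrm{eff}}}{\partial f_D} \right|_{f_D = 0}
= -(1-\alpha)\, \dot{R}_{\mathrm{sum}}(\rho) (1 + K d \rho) \frac{2\beta}{\alpha}, \tag{37}
$$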



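where $\left. \frac{\partial \rho_{\mathrm{eff}}}{\partial f_D} \right|_{f_D = 0}$ is evaluated by noticing that after solving the inner maximization in (28) and
obtaining $\tilde{\sigma}^{2*}$ in (30) we have $\left. \frac{\partial \tilde{\sigma}^{2*}}{\partial f_D} \right|_{f_D = 0} = \frac{2\beta\sigma^2}{\alpha P}$. The term $\dot{R}_{\mathrm{sum}}(\rho)$ can be obtained by a standard
differentiation of the exponential integral rate expression in (8) w.r.t. $\rho$ and is given directly in the
statement of Proposition 1. $\ddot{R}_{\mathrm{sum}}(\rho)$ is obtained similarly. As for the second order term, we have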



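$$
\begin{aligned}
\left. \frac{\partial^2 \bar{R}_{\mathrm{eff}}(P, T_{\mathrm{ohd}})}{\partial f_D^2} \right|_{f_D = 0}
&\overset{(a)}{=} (1-\alpha) \left[ \ddot{R}_{\mathrm{sum}}(\rho) \left( \frac{\partial \rho_{\mathrm{eff}}}{\partial f_D} \right)^2 + \dot{R}_{\mathrm{sum}}(\rho) \frac{\partial^2 \rho_{\mathrm{eff}}}{\partial f_D^2} \right]_{f_D = 0} \\
&\overset{(b)}{=} (1-\alpha) \left( \frac{2\beta}{\alpha} \right)^2 (1 + \rho K d) \left( \ddot{R}_{\mathrm{sum}}(\rho)(1 + \rho K d) + 2 K d \dot{R}_{\mathrm{sum}}(\rho) \right),
\end{aligned} \tag{38}
$$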



where (a) expands for clarity and (b) is by noticing that $\frac{\partial^2 \tilde{\sigma}^{2*}}{\partial f_D^2} = 0$ since $\tilde{\sigma}^{2*}$ is linear in
$f_D$ and otherwise replacing the values of the different variables. Combining (37) and (38) we
get the resulting second order expansion. Higher order expansions can be found if additional
accuracy is needed; however, the second order expansion is in general sufficient.
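The two derivatives of the effective SNR used in this appendix can be checked numerically. The sketch below rests on an assumption: the effective SNR is not restated in this section, so the form $\rho_{\mathrm{eff}} = \rho(1 - s)/(1 + \rho K d\, s)$, with $s$ the minimum CSI distortion normalized so that $s = 2\beta f_D/(\alpha\rho)$, is inferred here because it reproduces both derivative values used in the proof. Function names and example parameters are hypothetical.

```python
import math


def beta(K, N_T, N_R, gamma):
    """Constant beta as defined in Proposition 1."""
    mu = math.sqrt(K * N_T * N_R) + N_R + math.sqrt(gamma * N_T * (K * N_T - N_R))
    return mu ** 2 / (gamma * (K * N_T - N_R))


def rho_eff(f_D, rho, K, d, b, alpha):
    """ASSUMED effective SNR: rho * (1 - s) / (1 + rho * K * d * s),
    where s = 2 * b * f_D / (alpha * rho) is the minimum distortion."""
    s = 2.0 * b * f_D / (alpha * rho)
    return rho * (1.0 - s) / (1.0 + rho * K * d * s)


def central_differences(fun, h):
    """First and second central differences of fun around zero."""
    first = (fun(h) - fun(-h)) / (2.0 * h)
    second = (fun(h) - 2.0 * fun(0.0) + fun(-h)) / (h * h)
    return first, second


# Example: three users, two antennas per node, gamma = 1, 20 dB SNR, alpha = 0.1.
b = beta(3, 2, 2, 1.0)
rho, K, d, alpha = 100.0, 3, 1, 0.1
first, second = central_differences(lambda f: rho_eff(f, rho, K, d, b, alpha), 1e-6)
expected_first = -(2.0 * b / alpha) * (1 + K * d * rho)
expected_second = (2.0 * b / alpha) ** 2 * 2 * K * d * (1 + K * d * rho)
```

Under the assumed form, `first` and `second` agree with `expected_first` and `expected_second` to within finite-difference error, matching the factors $(1 + K d \rho)\,2\beta/\alpha$ and $(2\beta/\alpha)^2\, 2Kd\,(1+\rho K d)$ that appear in the first- and second-order terms.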
Fig. 1. K-User MIMO interference channel model.
Fig. 2. The overhead model adopted, in which training and feedback consume resources that would otherwise be used for data
transmission.
Fig. 3. Effective Sum-Rate vs. SNR for systems with different normalized Doppler spreads. This quantifies the loss in sum-rate
due to both imperfect CSI and overhead and shows that the performance predicted by the analytical results presented is an
accurate representation of optimal performance.
Fig. 4. Effective Sum-Rate vs. Normalized Doppler for IA systems at different SNR levels. This quantifies the degradation in
sum-rate as mobility increases, resulting in an increased overhead penalty.
Fig. 5. $T_{\mathrm{ohd}}$ vs. $T_{\mathrm{frame}}$. This confirms that the optimal value of $T_{\mathrm{ohd}}$ scales with $\sqrt{T_{\mathrm{frame}}}$ as predicted, and shows that
optimizing a series expansion of the objective yields remarkably accurate results.
Fig. 6. $\alpha^*$ vs. $T_{\mathrm{frame}}$. This confirms that the optimal value of $\alpha$ scales with $1/\sqrt{T_{\mathrm{frame}}}$
and thus scales with $\sqrt{f_D}$ as predicted.
Fig. 7. $\alpha^*$ vs. SNR. This shows the decrease of the optimal overhead budget with SNR. As stated in Section V, it can be
shown that the decrease is logarithmic with SNR. The figure also demonstrates that our expansion-based results are very accurate,
deviating only slightly in high-SNR high-mobility scenarios.
Fig. 8. This figure shows the relationship between $\alpha^*$ and $\bar{R}^{*}_{\mathrm{eff}}(P)$ with the feedback channel's relative quality for a system
with $T_{\mathrm{frame}} = 10^4$. Plot (a) verifies the increase of overhead with $1/\gamma$; when plotted in linear scale, the square-root rate of increase
can be verified. Plot (b) verifies the rate of decrease of optimal effective sum-rate with feedback link quality.
Fig. 9. Optimal Cluster Size vs. $T_{\mathrm{frame}}$. This shows the optimal number of users to coordinate via IA, which increases with the channel's
coherence time. This also shows that comparing overhead, i.e., overhead-based selection, provides accurate decisions on optimal
cluster size.
Fig. 10. Effective Sum-Rate with Cluster Size Optimization vs. $T_{\mathrm{frame}}$. This shows the increase in effective sum-rate as a
function of $T_{\mathrm{frame}}$ when the cluster size is chosen to maximize rate. This also quantifies the minimal sum-rate loss due to
sub-optimal overhead-only based cluster sizing.



